Phoneme-based vector quantization in a discrete HMM speech recognizer

Authors

Yaxin Zhang

Roberto Togneri

Michael D. Alder

Abstract

The quantization distortion of vector quantization (VQ) is a key factor affecting the performance of a discrete hidden Markov modeling (DHMM) system. Many researchers have recognized this problem and have tried to use integrated features or multiple codebooks in their systems to offset the disadvantage of conventional VQ. However, the computational complexity of those systems is then increased. Investigations have shown that the speech signal space consists of a finite number of clusters that represent phoneme data sets from male and female speakers and exhibit Gaussian distributions. In this paper we propose an alternative VQ method in which each phoneme is treated as a cluster in the speech space and a Gaussian model is estimated for each phoneme. A Gaussian mixture model (GMM) is generated by the expectation-maximization (EM) algorithm for the whole speech space and used as a codebook in which each code word is a Gaussian model and represents a certain cluster. An input utterance is classified as a certain phoneme or set of phonemes only when that phoneme or those phonemes give the highest likelihood. A typical discrete HMM system was used for both phoneme and isolated word recognition. The results show that phoneme-based Gaussian modeling vector quantization classifies the speech space more effectively, and that significant improvements in the performance of the DHMM system have been achieved.
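The idea of a codebook whose code words are Gaussian models can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a GMM trained by EM over the whole speech space, whereas this simplified version assumes labeled frames and estimates one diagonal-covariance Gaussian per phoneme directly. The function names, the phoneme labels `aa` and `iy`, and the two-dimensional toy features are all hypothetical.

```python
import math

def fit_gaussian(frames):
    """Estimate a diagonal-covariance Gaussian (mean, variance) from frames."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Assumption: a small floor keeps variances strictly positive.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def log_likelihood(frame, model):
    """Log density of one frame under a diagonal Gaussian."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(frame, mean, var))

def build_codebook(labeled_frames):
    """Map each phoneme label to its Gaussian code word."""
    return {ph: fit_gaussian(fr) for ph, fr in labeled_frames.items()}

def quantize(frame, codebook):
    """Return the label of the code word with the highest likelihood."""
    return max(codebook, key=lambda ph: log_likelihood(frame, codebook[ph]))

# Hypothetical toy data: two phoneme clusters in a 2-D feature space.
training = {
    "aa": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]],
    "iy": [[5.0, -1.0], [5.3, -0.8], [4.7, -1.2]],
}
codebook = build_codebook(training)
utterance = [[1.1, 2.1], [4.9, -0.9], [0.9, 1.9]]
symbols = [quantize(f, codebook) for f in utterance]
print(symbols)
```

The resulting sequence of labels plays the role of the discrete observation symbols that a DHMM consumes. A conventional VQ codebook would instead assign each frame to the nearest centroid by distance, without using the cluster variances.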


Similar resources

Support Vector Machines for Postprocessing of Speech Recognition Hypotheses

In this paper, we introduce an approach to improve the recognition performance of a Hidden Markov Model (HMM) based monophone recognizer using Support Vector Machines (SVMs). We developed and examined a method for re-scoring the HMM recognizer hypotheses by SVMs in a phoneme recognition framework. Compared to a stand-alone HMM system, an improvement of 9.2% was reached on the TIMIT database and...


Lvq as a Feature Transformation for Hmms

We present a new way to take advantage of the discriminative power of Learning Vector Quantization in combination with continuous density hidden Markov models. This is based on viewing LVQ as a non-linear feature transformation. Class-wise quantization errors of LVQ are modeled by continuous density HMMs, whereas the practice in the literature regarding LVQ/HMM hybrids is to use LVQ-codebooks ...


Phone vector DHMM to decode a phone recognizer's output

In this paper we introduce a Phone Vector Discrete HMM (PVDHMM) that decodes a phone recognizer’s output. The proposed PVDHMM treats a phone recognizer as a vector quantizer whose codebook size is equal to the size of its phone set. To examine the proposed method we perform two experiments. First, the output of a phone recognizer is recognized by the PVDHMM, and its results are compared with th...


Advanced training methods and new network topologies for hybrid MMI-connectionist/HMM speech recognition systems

This paper deals with the construction and optimization of a hybrid speech recognition system that consists of a combination of a neural vector quantizer (VQ) and discrete HMMs. In our investigations an integration of VQ based classification in the continuous classifier framework is given and some constraints are derived that must hold for the pdfs in the discrete pattern classifier context. Furth...


Improving the performance of HMM-based very low bit rate speech coding

In this paper, we define an F0 quantization scheme for a very low bit rate speech coder based on HMM (Hidden Markov Model). In the coding system, the encoder carries out phoneme recognition, and transmits phoneme indices, state durations and F0 information to the decoder. In the decoder, phoneme HMMs are concatenated according to the phoneme indices, and a sequence of mel-cepstral coefficient v...



Journal:

IEEE Trans. Speech and Audio Processing

Volume 5

Publication date: 1997
